Sonic Hedgehog Intron Variant Associated With an Unusual Pediatric Cortical Cataract

Purpose: To identify the genetic basis of an unusual pediatric cortical cataract demonstrating autosomal dominant inheritance in a large European–Australian pedigree.
Methods: DNA from four affected individuals was exome sequenced using a NimbleGen SeqCap EZ Exome V3 kit and HiSeq 2500. DNA from 12 affected and four unaffected individuals was genotyped using Human OmniExpress-24 BeadChips. Multipoint linkage and haplotyping were performed (Superlink-Online SNP). DNA from one affected individual and his unaffected father was whole-genome sequenced on a HiSeq X Ten system. Rare small insertions/deletions and single-nucleotide variants (SNVs) were identified in the disease-linked region (Golden Helix SVS). Combined Annotation Dependent Depletion (CADD) analysis predicted variant deleteriousness. Putative enhancer function and variant effects were determined using the Dual-Glo Luciferase Assay system.
Results: Linkage mapping identified a 6.23-centimorgan support interval at chromosome 7q36. A co-segregating haplotype refined the critical region to 6.03 Mbp containing 21 protein-coding genes. Whole-genome sequencing uncovered 114 noncoding variants, of which CADD predicted one to be highly deleterious: a novel substitution within intron-1 of the sonic hedgehog signaling molecule (SHH) gene. ENCODE data suggested this site was a putative enhancer, which was subsequently confirmed by luciferase reporter assays showing variant-associated gene overexpression.
Conclusions: In a large pedigree, we have identified a SHH intron variant that co-segregates with an unusual pediatric cortical cataract phenotype. SHH is important for lens formation, and mutations in its receptor (PTCH1) cause syndromic cataract. Our data implicate increased function of an enhancer important for SHH expression primarily within developing eye tissues.

In 2011, we described a large Australian family of European descent that variably presented with four ocular phenotypes: (1) pediatric cortical cataracts (PCCs), (2) familial exudative vitreoretinopathy (FEVR), (3) asymmetric myopia with astigmatism (AM+A), and (4) primary open-angle glaucoma (POAG). 1 Inspection of the pedigree revealed that the PCC phenotype appeared to segregate in an autosomal dominant inheritance pattern, suggesting a single genetic cause.
Herein, we report the identification of a single disease-linked region and co-segregating haplotype at the telomeric end of chromosome 7. We describe exome and whole-genome sequencing (WGS) approaches that led to the discovery of a predicted highly deleterious novel substitution within intron-1 of the sonic hedgehog signaling molecule (SHH) gene. We illustrate our application of Encyclopedia of DNA Elements (ENCODE) project data that implicated the intronic region as a candidate cis-regulatory element (cCRE) important for SHH gene expression selectively in eye tissues during development. Finally, we show confirmation of enhancer function and the effect of the novel intron variant on gene expression using a cell-based reporter assay. Together, these data implicate eye-specific perturbations in SHH expression during development as the molecular cause of the unusual PCC phenotype observed in this family.
[Figure 1 caption, beginning missing] Haplotypes were imputed for four unaffected individuals where DNA was unavailable (no dashed circle). Haplotype blocks identified within the disease-linked genomic region are depicted as rectangles below each individual with colors distinguishing different chromosomal runs of SNP marker alleles. A haplotype block fully co-segregating with disease is highlighted (dark red box). In individual IV:2, a recombination event was identified that refined the critical disease interval. Individuals selected for exome (E) and WGS (W) are shown.
[...] insertion/deletions (indels) were called with the Genome Analysis Toolkit (GATK) 1.6 (https://gatk.broadinstitute.org), and variant filtering/analysis was performed using SNP & Variation Suite (SVS) 8 (Golden Helix, Bozeman, MT, USA; https://www.goldenhelix.com/products/SNP_Variation). An in-house dataset of exome variants identified in 119 individuals with unrelated disease was used to filter out unrelated variants.
Variant allele frequency data were acquired from the Genome Aggregation Database (gnomAD) 2.1.1 release (https://gnomad.broadinstitute.org). 2 Combined Annotation Dependent Depletion (CADD; https://cadd.gs.washington.edu) was used to score the deleteriousness of SNVs and indels, and variants with a score under 20 (not in the top 1% deleterious genome variants) were removed from further analysis. 3
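As a rough illustration of the CADD cutoff described above, the following Python sketch converts a scaled (PHRED-like) CADD score into the fraction of top-ranked variants it corresponds to and applies the threshold of 20. The variant records, identifiers, and field names are hypothetical and are not taken from the study data; this is not the SVS filtering workflow itself.

```python
def cadd_top_fraction(scaled_score: float) -> float:
    """Fraction of variants ranked at or above a scaled CADD score.

    Scaled CADD scores are PHRED-like: 10 -> top 10%, 20 -> top 1%, 30 -> top 0.1%.
    """
    return 10 ** (-scaled_score / 10)


def keep_deleterious(variants, threshold=20.0):
    """Keep variants whose scaled CADD score is at or above the threshold."""
    return [v for v in variants if v["cadd_scaled"] >= threshold]


# Hypothetical example records (illustrative values only).
variants = [
    {"id": "var_A", "cadd_scaled": 21.7},
    {"id": "var_B", "cadd_scaled": 4.2},
    {"id": "var_C", "cadd_scaled": 12.9},
]

kept = keep_deleterious(variants)
print([v["id"] for v in kept])          # ['var_A']
print(f"{cadd_top_fraction(20):.2%}")   # 1.00%
```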

Genotyping, Multipoint Linkage Analysis, and Haplotyping
Twelve affected and four unaffected individuals were genotyped at 713,014 single nucleotide polymorphisms (SNPs) using Human OmniExpress-24 1.1 BeadChips (Illumina) (Fig. 1). The 103,311 highest quality and informative markers were selected with GenomeStudio Software (Illumina). Genome-wide, multipoint, parametric linkage analysis was performed using Superlink-Online SNP 1.1 (Technion, Haifa, Israel; http://cbl-hap.cs.technion.ac.il/superlink-snp). 6 A dominant inheritance model was applied with a disease penetrance of 0.99, and linkage was calculated via a hidden Markov model algorithm using a random subset of approximately 1000 markers per chromosome (23,359 SNPs total). The Superlink-Online SNP software was subsequently employed to identify haplotype blocks for markers located within the disease-linked region. Individual haplotypes were assessed for co-segregation with disease throughout the pedigree. The disease-associated haplotype was reviewed for recombination events in all affected individuals to enable refinement of the critical disease interval.
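The marker-thinning step mentioned above (a random subset of roughly 1000 markers per chromosome) can be pictured with the minimal Python sketch below. It is only an assumption about how such a subset could be drawn; the actual selection procedure inside Superlink-Online SNP is not described in the text, and the function name, tuple layout, and toy markers are hypothetical.

```python
import random
from collections import defaultdict


def subsample_markers(markers, per_chromosome=1000, seed=0):
    """Draw a random subset of markers for each chromosome.

    `markers` is an iterable of (chromosome, position, snp_id) tuples.
    Returns a dict mapping chromosome -> position-sorted list of (position, snp_id).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_chrom = defaultdict(list)
    for chrom, pos, snp_id in markers:
        by_chrom[chrom].append((pos, snp_id))

    subset = {}
    for chrom, entries in by_chrom.items():
        k = min(per_chromosome, len(entries))  # cannot draw more than exist
        subset[chrom] = sorted(rng.sample(entries, k))
    return subset


# Toy markers (made-up identifiers and positions).
markers = [("7", i * 1000, f"snp7_{i}") for i in range(50)]
markers += [("1", i * 1000, f"snp1_{i}") for i in range(5)]

subset = subsample_markers(markers, per_chromosome=10)
print({chrom: len(entries) for chrom, entries in subset.items()})
```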

Assessment of Enhancer Function
Putative enhancer function and the effects of the SHH intron variant were determined using the Dual-Glo Luciferase Assay system (Promega, Madison, WI, USA) according to the manufacturer's protocol. A 296-bp genomic region including the 221-bp putative intronic regulatory element (chr7:155808575-155808796, GRCh38; chr7:155601269-155601490, GRCh37) was PCR amplified from affected individual III-4 using primers incorporating NheI and HindIII restriction endonuclease sites for subsequent cloning (forward, 5′-CTGACTGAGCTAGCGAGGCCGAGGGTTGCTGGAGTTGG-3′; reverse, 5′-TCAGTCAGAAGCTTCGGCTCGCAGATCAGGGAGGTAGG-3′). The genomic fragments were ligated into vector pGL4.24 (Promega), upstream of a firefly luciferase reporter gene containing a minimal promoter. Three experimental conditions were tested using constructs including reference, variant, or no putative regulatory element sequence (empty vector). Experimental constructs were transfected into human hepatocellular carcinoma cells (HepG2; American Type Culture Collection, Manassas, VA, USA) in parallel with a Renilla luciferase control plasmid (pGL4.73 [hRluc/SV40]; Promega) for subsequent normalization of transfection efficiencies. Twenty-four biological replicates were performed for each condition. Average response ratios were calculated from experimental versus control luminescence signals relative to the reference condition. Standard errors were calculated, and statistical significance was determined using the paired t-test.
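The normalization described above can be sketched in a few lines of Python: each firefly reading is divided by its paired Renilla reading, and the resulting ratios are expressed relative to the mean of the reference condition. This is a minimal sketch under stated assumptions, not the authors' analysis script; the luminescence values are invented, only three replicates are shown for brevity, and the t-test step is omitted.

```python
from math import sqrt
from statistics import mean, stdev


def response_ratios(firefly, renilla):
    """Firefly/Renilla ratio per replicate (corrects for transfection efficiency)."""
    return [f / r for f, r in zip(firefly, renilla)]


def relative_to_reference(ratios, reference_ratios):
    """Express ratios relative to the mean ratio of the reference condition."""
    ref_mean = mean(reference_ratios)
    return [x / ref_mean for x in ratios]


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


# Invented luminescence readings (arbitrary units), three replicates per condition.
reference = response_ratios([1000, 1100, 900], [100, 100, 100])
variant = response_ratios([1500, 1600, 1400], [100, 100, 100])

variant_rel = relative_to_reference(variant, reference)
print(f"variant vs reference: {mean(variant_rel):.2f} +/- {sem(variant_rel):.3f}")
```

With these made-up readings, the variant construct would come out at about 1.5 times the reference signal; real data would of course vary between replicates in both channels.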

Core Pedigree Identified With an Unusual Dominantly Inherited PCC
We previously reported a detailed description of the family's clinical phenotypes in a 63-person extended pedigree. 1 Herein, we investigated the molecular basis of disease in a 20-person core portion of the pedigree (Fig. 1, Table 1) primarily affected with PCCs, which were observed in 11 individuals (Figs. 2A, 2B). For individual IV-7, we could not describe her cataract features, as she underwent bilateral lensectomies at a young age prior to ascertainment. We classified her as affected due to having FEVR and AM+A clinical features shared with five affected relatives with PCCs. Six individuals were affected with FEVR, of whom five had dragged retinal vessels (Fig. 2C), and another developed a severe retinal detachment (Fig. 2D). Four individuals had POAG, and eight individuals had AM+A of at least -3 diopters (D) in one eye. The pedigree showed autosomal dominant transmission of PCC through all five generations. Segregation of the three additional ocular phenotypes (FEVR, AM+A, and POAG) was less consistent, but we included them as potential comorbidities.

Single Candidate Coding Variant in GALNT11 Excluded
Overall, exome sequencing identified only one rare coding variant shared by all four affected individuals: a c.1788C>T (NM_022087.4) change in the GALNT11 gene that was synonymous (p.C596C). The variant fully co-segregated with the affection status of all core family members. The variant is known (dbSNP rs374094445) but observed at low frequency, with the highest frequency observed in Africans and African Americans (minor allele frequency [MAF], 0.00008). GALNT11 showed the strongest tissue expression in kidney, pancreas, and retina (Genotype-Tissue Expression project), consistent with a role in the eye. Analysis of the synonymous change using ESE Finder predicted destruction of an exonic splicing enhancer (ESE; a serine/arginine-rich splicing factor 1 binding site) within exon 12. However, an exon-trapping assay that used expression of reference and variant GALNT11 mini-gene constructs in Cos-7 cells determined that the resulting transcript species were spliced identically (Fig. 3), indicating that the GALNT11 variant was functionally benign.

Disease Locus Identified on Chromosome 7q36
To map the chromosomal location of the disease-causing gene, we previously performed linkage analyses on the larger family pedigree using 6008 genome-wide SNPs (Linkage Panel IVb, Illumina). 1 Two-point analysis using FastLink software 8,9 and multipoint analysis using MERLIN software 10 revealed multiple SNPs of interest but no statistically significant focal localization. In this study, we employed high-density BeadChips to genotype 713,014 SNPs in individuals throughout the core pedigree. Following selection of the 103,311 highest quality and informative markers, multipoint linkage analysis was performed on a random subset of 23,359 genome-wide SNPs using Superlink-Online SNP software. This approach identified a single 6.23-centimorgan (cM) support interval located at the telomeric end of chromosome 7q36 with a maximum logarithm of the odds (LOD) score of 3.3 (LODmax, rs6966462; 1 LOD unit support interval, rs1525041 to telomere) (Fig. 4). Haplotype analysis of SNP markers across the linked region identified a single haplotype block co-segregating with disease in all 12 affected individuals and none of the unaffected family members (Fig. 1). A recombination event between markers rs396672 and rs6960275 discovered in individual IV:1 and their son (V:1) refined the centromeric end of the disease interval (chr7:153017006-159138663, GRCh37; chr7:153319921-159345973, GRCh38) to a 6.03-Mbp region containing 21 protein-coding genes.
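Two of the figures quoted above can be checked with simple arithmetic, sketched here in Python. The interval size is computed from the GRCh38 coordinates given in the text, and the LOD score is converted to odds using its standard definition as the base-10 logarithm of the likelihood ratio for linkage versus no linkage. The function names are illustrative only.

```python
def interval_size_mbp(start: int, end: int) -> float:
    """Length of a genomic interval in megabase pairs."""
    return (end - start) / 1_000_000


def lod_to_odds(lod: float) -> float:
    """Odds in favor of linkage implied by a LOD score (odds = 10 ** LOD)."""
    return 10 ** lod


# GRCh38 coordinates of the refined disease interval, as quoted in the text.
size = interval_size_mbp(153_319_921, 159_345_973)
print(f"{size:.2f} Mbp")              # 6.03 Mbp

# A LOD score of 3.3 corresponds to odds of roughly 2000:1 in favor of linkage.
print(round(lod_to_odds(3.3)))        # 1995
```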

Novel Noncoding Variant Identified in Intron-1 of SHH
As exome sequencing had not identified the pathogenic variant within the coding regions of the genome, we performed WGS to discover additional variants located within uncaptured coding exons and gene regulatory regions such as promoters and enhancers. Two individuals were chosen for WGS based on their known disease haplotype status: a father lacking the disease haplotype and his son who carried the disease haplotype (II-1 and III-4, respectively) (Fig. 1). Sequencing these two individuals identified all variants located within the disease haplotype by excluding those detected in the unaffected father from those discovered in the affected son. Variants observed in three unrelated WGS samples were also excluded.
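The subtraction strategy described above amounts to a set difference restricted to the disease interval. The Python sketch below illustrates the idea on made-up records; the dictionary layout, function names, and positions are hypothetical, and the study itself performed this filtering in SVS rather than with custom code.

```python
def variant_key(v):
    """Identify a variant by chromosome, position, and alleles."""
    return (v["chrom"], v["pos"], v["ref"], v["alt"])


def candidate_variants(affected, unaffected_parent, unrelated_samples, interval):
    """Variants in the affected individual, inside the interval, seen nowhere else."""
    chrom, start, end = interval
    excluded = {variant_key(v) for v in unaffected_parent}
    for sample in unrelated_samples:
        excluded |= {variant_key(v) for v in sample}
    return [
        v for v in affected
        if v["chrom"] == chrom
        and start <= v["pos"] <= end
        and variant_key(v) not in excluded
    ]


# Disease interval (GRCh38) as quoted in the text; variant positions are made up.
interval = ("7", 153_319_921, 159_345_973)
son = [
    {"chrom": "7", "pos": 155_000_100, "ref": "C", "alt": "T"},  # unique to son
    {"chrom": "7", "pos": 154_000_200, "ref": "G", "alt": "A"},  # shared with father
    {"chrom": "7", "pos": 156_000_300, "ref": "A", "alt": "G"},  # in unrelated sample
    {"chrom": "7", "pos": 100_000_000, "ref": "T", "alt": "C"},  # outside interval
]
father = [son[1]]
unrelated = [[son[2]]]

candidates = candidate_variants(son, father, unrelated, interval)
print([v["pos"] for v in candidates])   # [155000100]
```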
Interestingly, WGS did not identify any rare coding variants located within the disease interval, a tribute to exome-capture methods. As expected, WGS detected many noncoding variants (Table 2). Remarkably, out of the 44 rare noncoding SNVs and indels identified, the CADD algorithm determined only one to be within the top 1% of deleterious variants in the genome (CADD score 21.7), a SNV located within intron-1 of the SHH gene (Fig. 5). The SHH gene expresses two protein-coding and two noncoding transcripts from alternative promoters. The intronic variant was located closest (290 bp) to the first exon of coding transcript-2. The potential pathogenicity of the variant was supported by its novelty, not being observed in more than 152,000 alleles in the gnomAD v3.1.2 database (WGS data). The intron variant was located within a highly conserved region, with the reference nucleotide shared by 74 of 81 vertebrate species with sequence data at this location, suggesting functional significance. The importance of this noncoding region for gene regulation was further implicated by its location within a CpG island and a regulatory element predicted by the ENCODE project. 11,12 Inspection of ENCODE data within the SCREEN Registry (version 3; https://screen.wenglab.org/) showed the intronic variant was located within a 221-bp candidate cCRE with a distal enhancer-like signature (dELS; EH38E2603698) (Fig. 6).
[Figure 6 caption] ENCODE data predict the variant is located within a distal cCRE important for eye development. Combined cell data of signals for DNase hypersensitivity (green), H3K4me3 (red) and H3K27ac (yellow) histone modifications, and CTCF binding (blue) implicate a 221-bp cCRE (yellow bar with associated data boxed in blue) distal to SHH transcriptional start sites. Highest DNase hypersensitivity signals were derived from embryonic eye and retina tissues (lower four green plots). The location of the novel deleterious intronic variant is denoted (red arrow/dotted line).
The ENCODE project identifies and classifies potential regulatory elements according to four biochemical signatures obtained from human cell lines and tissues. Primarily, the region must be in an open chromatin state, making the site DNase hypersensitive, which can be determined by DNase-seq experiments. Secondary support is provided by chromatin immunoprecipitation sequencing (ChIP-Seq) experiments that identify histone modifications (H3K4me3 and H3K27ac) and CCCTC-binding factor (CTCF) binding that are important for regulating chromatin state and gene expression. The intron variant was within a cCRE with high signals for all four biochemical signatures when data were combined from all ENCODE biosamples (maximum z-scores: DNase, 3.31; H3K4me3, 4.59; H3K27ac, 4.28; CTCF, 2.98). Furthermore, out of 1118 human tissues profiled, the two highest DNase-seq signals detected at the cCRE were derived from tissues of 76-day female embryonic eye (z-score, 3.31) and 56-day male embryonic eye (z-score, 3.18). Retina tissue from a 74-day embryo provided the third highest DNase-seq signal (z-score, 3.14), and an 89-day female embryonic retina tissue was fifth (z-score, 2.96). Taken together, these data provide strong evidence to support the specific importance of the cCRE during eye development.

Novel Intronic Variant Affects a SHH Distal Enhancer
To test whether the cCRE could influence gene expression and to determine the effect of the novel intronic variant, we performed luciferase reporter assays, which placed the reference and variant element sequences upstream of a firefly luciferase gene with a minimal promoter. For an enhancer to function effectively, transcription factors (TFs) are required that bind to the enhancer sequence and aid in the recruitment of RNA polymerase II to the promoter. Experimental data from the ENCODE project showed that 19 different TFs, including CTCF, bind to the cCRE sequence, but artificially creating such molecular environmental conditions in cultured cells would have proven problematic. Further interrogation of the ENCODE data showed that, out of 28 cell lines profiled, the highest biochemical signals observed for all four enhancer signatures were identified in a commercially available human liver carcinoma cell line, HepG2 (z-scores: DNase, 3.19; H3K4me3, 4.05; H3K27ac, 3.08; CTCF, -10.00). Consequently, we employed HepG2 cells to endogenously provide the TFs critical for enhancer function in our luciferase reporter experiments.
The luciferase assay results (Fig. 7) showed that the presence of either the reference or variant cCRE sequence greatly increased the levels of reporter expression, confirming the intronic element as a legitimate enhancer. Interestingly, compared to the reference enhancer, the variant form caused significant overexpression of the reporter gene. Together, these data support altered SHH gene expression as a functional consequence of the intronic variant.

DISCUSSION
SHH is the most widely studied ligand of the hedgehog signaling pathway and a well-known classical morphogen, inducing embryonic cell targets to differentiate into specific cellular identities according to the precise levels and duration of its signaling. 13 The mechanism of SHH signaling can be described as "double-negative" activation. 14 In the presence of SHH ligand, its binding to the Patched (PTCH) transmembrane receptor relieves repression on a second transmembrane protein, Smoothened (SMO). 13 Activated SMO then initiates intracellular signal transduction resulting in GLI family transcription factor processing into their active forms (GLIA) and subsequent transcription of SHH target genes. In the absence of SHH signal, the PTCH receptor inhibits SMO, and the GLI proteins are either degraded or processed into transcriptional repressors (GLIRs). GLI-targeted genes include PTCH1 itself, thus creating a negative feedback loop that represses SHH pathway activation through limiting ligand dispersion and signal transducer activity. 15 SHH can signal in an autocrine fashion, affecting the cells in which it is produced, or it can act as a paracrine signal to induce changes in other cells. Further regulation of paracrine signaling is provided by the Dispatched (DISP) protein, which facilitates the extracellular secretion of SHH. 16 The action of SHH in defined concentration gradients and spatiotemporal patterns is crucial for organizing many developing tissues, including the eye, craniofacial structures, brain, spinal cord, and limbs. 17 In addition, SHH signaling regulates adult stem cells involved in the maintenance and regeneration of adult tissues. 18 In humans, mutations in SHH have been reported to cause a wide range of developmental abnormalities.
The severe end of the disease spectrum includes massive brain and ocular malformations, such as holoprosencephaly (OMIM 236100), where the forebrain fails to separate into two hemispheres, accompanied by cyclopia, anophthalmia, or microphthalmia. 19,20 At the mild end of the disease spectrum, SHH mutations have resulted in more subtle effects on specific eye components, such as isolated iris and uveal-retinal coloboma, with or without microphthalmia. 21-23
SHH signaling plays a crucial role in eye vesicle patterning in vertebrates. SHH promotes expression of PAX2 in the optic stalk and represses expression of PAX6 in the optic cup. 24 SHH signaling contributes to establishment of both proximal-distal and dorsal-ventral axes by activating VAX1, VAX2, and PAX2. 25 In the dorsal part of the developing retina, BMP4 is expressed and antagonizes the ventralizing effects of SHH signaling. 26 In human embryos, SHH expression has been demonstrated in the developing optic vesicle, superficial lens, and posterior retina. 27 SHH is also important for the differentiation of periocular neural crest (PNC) cells, which are indispensable for anterior chamber development. 20 PNCs interact with the forming cornea and lens and define the edge of the optic cup from which the ciliary body, Schlemm's canal, and iris are formed. 28
Studies support an association between abnormal SHH signaling and the PCC phenotype observed in the family described here. Although there do not appear to be reports of nonsyndromic cataracts caused by SHH mutations in the literature, the SHH pathway is important for lens development. The SHH ligand, PTCH receptor, and BMP4 antagonist are all expressed in the developing lens. 27 Congenital cataracts are found in 18% of patients with Gorlin-Goltz syndrome, which is caused by mutations in the SHH receptor gene PTCH1. 29 BMP4 is expressed strongly in the optic vesicle and weakly in the surrounding mesenchyme and surface ectoderm, where it has crucial roles during lens induction. 30 SHH signaling plays a key role in cataract development and in the response of normal lenses to radiation injury. Mice heterozygous for Ptch1 develop spontaneous cataracts and are highly susceptible to cataract induction by exposure to ionizing radiation at an early postnatal age, when lens epithelial cells undergo rapid expansion in the lens epithelium. 31
FEVR is characterized by variable degrees of avascular peripheral retina, extraretinal neovascularization, macular dragging, and retinal detachment. This phenotype was observed in six of the 12 (50%) individuals carrying the SHH variant and was not detected in four family members without the variant. We are unable to provide further genetic evidence to support a causative relationship between the SHH variant and this phenotype but include it for the purpose of providing information regarding FEVR as a possible comorbidity. It should be noted that the SHH pathway is an important component of normal retinal angiogenesis with activation of Shh leading to an increase in angiogenic factors (angiopoietin-1 and angiopoietin-2), 32 and SHH blockade inhibiting vascular neogenesis. 33 Mutations in the PTCH1 gene can lead to an abnormal intraretinal glial response that stimulates the retinal surface to proliferate and contract, creating retinal membranes, dragged vessels, and macular holes in patients with Gorlin-Goltz syndrome. 29,34-37 Similar abnormalities are exhibited in mice lacking a Ptch1 allele, wherein abnormal vitreoretinal cell cycle regulation leads to photoreceptor dysplasia and Müller cell-derived gliosis. 38 Funnel retinal detachment has also been reported in patients with BMP4 mutations. 27
The AM+A phenotype was observed in 73% (8/11) of assessed SHH variant carriers and is likely a secondary result of the cataracts distorting the focus of the lens. Due to the childhood onset of the cataracts, they may also affect visual cues involved with retinoscleral signaling during emmetropization, leading to abnormal scleral remodeling of ocular shape and further refractive error. The environment is a well-established modifier of refractive error status, which may also explain the reduced penetrance of this potential comorbidity. It is worth noting that myopia appears to be common in patients with holoprosencephaly, with one study observing five of 10 patients (50%) with moderate myopia. 39 Myopia is also commonly observed in Gorlin-Goltz syndrome, with one report describing nine of 11 patients (82%) affected by various degrees of myopia, ranging from −0.5 to −10 D. 29 Two patients showed high anisometropia (18%), with a 6-D difference in one patient and 10-D difference in the other; only two patients (18%) were emmetropes. In guinea pigs, the SHH signaling pathway has been shown to induce myopia by activating matrix metalloproteinase 2. 40 In chicks with experimentally induced myopia, SHH expression is increased in the retina, which suggests its involvement in the retinoscleral feedback that controls postnatal eye growth. 41
POAG affects pedigree members over 40 years of age, so we cannot ascribe affection status to younger family members. This limits the genetic evidence to support an association between POAG and the SHH variant identified in this family. Currently, four of the five (80%) SHH variant carriers over the age of 40 have developed POAG, and we expect that more younger carriers will develop glaucoma as they age (Table 1).
SHH has not previously been associated with glaucoma, but the involvement of the pathway in the differentiation of PNC cells, which are important for anterior chamber development and subsequent ciliary body, Schlemm's canal, and iris formation, makes it an obvious candidate. 20,28 Subtle developmental defects in these structures could lead to late-onset failure of aqueous outflow resulting in POAG. SHH also plays a role in early retinal ganglion cell (RGC) development, as it is secreted by differentiated RGCs to induce its own expression and promote the differentiation of retinal precursor cells into further RGCs. 42-44 Additionally, the precise regulation of SHH levels inside the retina is critical, as it stimulates RGC axon growth at low concentrations but inhibits growth at high concentrations. 45
The SHH enhancer affected in this family appears to be important primarily during eye tissue development, which explains the lack of non-ocular phenotypes normally associated with mutations in the SHH gene or its molecular partners. There are reported examples of other genes associated with both major ocular developmental abnormalities and non-syndromic childhood or adult cataract. Mutations in the SOX2 gene are known to account for 20% of anophthalmia and microphthalmia in humans, but markers in proximity to the SOX2 gene and within its regulator, SOX2-OT, have recently been associated with cataract. 46-48 Likewise, PAX6 is well known as a master regulator of eye formation in which mutations have largely been associated with aniridia, but mutations in this gene have also been identified in families with congenital cataracts. 49,50
Many long- and short-range enhancers are likely to be selectively used to control the exact concentration and spatiotemporal patterns of SHH expression. Importantly, data from the ENCODE project have implicated the distal enhancer presented in this study as being selectively important for human embryonic eye development.
Furthermore, our cell-based assay data have shown that the novel intron variant identified in this family causes enhanced gene expression. As SHH is known to be important for the development of the lens in humans, for which precise gene expression is critical, we propose that the novel intronic variant is responsible for the unusual PCC phenotype observed in this unique family. FEVR, AM+A, and POAG were not observed in all individuals carrying the SHH variant, which may be due to incomplete penetrance. However, further genetic evidence will be required to confirm they can also be caused by SHH mis-expression in the eye.
In conclusion, this family report and genetic study expand the ocular phenotypic expression of mutations associated with the SHH gene to include PCC. Based on our findings, we recommend investigation of genes within the SHH pathway in families with cataract. The SHH pathway should also be considered in families with glaucoma, myopia, and retinal vasculopathy phenotypes.